Abstract. — The analysis of ratios of body measurements is deeply ingrained in the taxonomic literature. Whether for plants 
or animals, certain ratios are commonly indicated in identification keys, diagnoses, and descriptions. They often provide the 
only means for separation of cryptic species that mostly lack distinguishing qualitative characters. Additionally, they pro- 
vide an obvious way to study differences in body proportions, as ratios reflect geometric shape differences. However, when 
it comes to multivariate analysis of body measurements, for instance, with linear discriminant analysis (LDA) or principal 
component analysis (PCA), interpretation using body ratios is difficult. Both techniques are commonly applied for separat- 
ing similar taxa or for exploring the structure of variation, respectively, and require standardized raw or log-transformed 
variables as input. Here, we develop statistical procedures for the analysis of body ratios in a consistent multivariate statis- 
tical framework. In particular, we present algorithms adapted to LDA and PCA that allow the interpretation of numerical 
results in terms of body proportions. We first introduce a method called the "LDA ratio extractor," which reveals the best 
ratios for separation of two or more groups with the help of discriminant analysis. We also provide measures for deciding 
how much of the total differences between individuals or groups of individuals is due to size and how much is due to 
shape. The second method, a graphical tool called the "PCA ratio spectrum," aims at the interpretation of principal compo- 
nents in terms of body ratios. Based on a similar idea, the "allometry ratio spectrum" is developed which can be used for 
studying the allometric behavior of ratios. Because size can be defined in different ways, we discuss several concepts of size. 
Central to this discussion is Jolicoeur's multivariate generalization of the allometry equation, a concept that was derived 
only with a heuristic argument. Here we present a statistical derivation of the allometric size vector using the method of 
least squares. The application of the above methods is extensively demonstrated using published data sets from parasitic 
wasps and rock crabs. [Allometry; Chalcidoidea; Hymenoptera; LDA ratio extractor; morphometry; multivariate statistics; 
PCA ratio spectrum.] 



The use of ratios of measurements (i.e., of body proportions) has
a long tradition and is deeply ingrained in
morphometric taxonomy (Reyment et al. 1984; Winston 
1999; Lestrel 2000; Schuh and Brower 2009). In many 
animal groups, the indication of such ratios is a stan- 
dard of species descriptions, diagnoses, or identification 
keys (Mayr and Ashlock 1991). This is especially true for 
many arthropods, where ratios are a convenient means 
for distinguishing between morphologically similar 
species which often differ significantly in body pro- 
portions but not in qualitative characters. In certain 
insect groups, such as parasitic wasps, numerous ra- 
tios are routinely reported (e.g., Townes and Townes 
1981; Kasparyan 1989; Noyes 2004; Horstmann 2009) 
and sometimes up to 30 ratios form the main body of a 
species description (see, e.g., Graham 1969, 1991). Often 
the use of ratios is rather implicit in descriptive terms, 
for instance, when leaves are described as being "nar- 
row" or "broad," both attributes that could be translated 
into ratios without loss of information. In fact, botanists 
use numerous such terms for various plant parts that 
could be partly or wholly substituted by ratios (Stuessy 
2009). Ratios are also used for phylogenetic analysis 
where they are treated as continuous characters (Thiele 
1993; Wiens 2000; Rae 2002; Goloboff et al. 2006). 

Besides tradition and ease of application, the 
widespread use of ratios is certainly related to a common 
way of looking at the shape of organisms. A taxonomist 
who notices similarity or dissimilarity in proportions
of two specimens can always adequately translate them
into a series of ratios. Any two individuals are then rec- 
ognized as having the same shape (i.e., the same body 
proportions), when all measurements differ by a (posi- 
tive) constant factor, for instance, when all of them are 
doubled. It does not matter whether a head length to width
ratio is, say, 2 : 4 mm or 4 : 8 mm; as long as the ratio (0.5)
is the same, the shape (as captured by the ratio) is the
same. The geometric shape expressed by ratios is thus
invariant for a particular measure of size (Mosimann 
1970). 

Often it is useful to go one step further and analyze 
more than two linear distances in a single analysis with 
the help of multivariate statistical methods. Over the 
past decades, a wide array of tools has been developed 
in the field of multivariate morphometry (Reyment et al. 
1984; Marcus 1990; Claude 2008). These methods help to 
unravel hidden population structure or to arrive at a better
differentiation of groups; in other words, they give
insights into the multivariate data structure that cannot be
achieved solely by ratio analysis. Standard applications 
are principal component analysis (PCA) and Fisher's 
linear discriminant analysis (LDA), both with raw data 
(often transformed into logarithmic scale) as the pri- 
mary input (see Pimentel 1979 for a readable account 
for biologists and Sorensen and Foottit 1992 for illustra- 
tive applications in insect systematics). Both methods 
aim to transform the original variables into a new sys- 
tem of coordinate axes, whereby most of the variance 
is contained in the first two or three axes. Traditionally, 
the results are then presented as scatter plots. However, 
the geometric meaning of these plots differs from the 
one obtained by the analysis of body ratios (Bookstein 
1989; Claude 2008). 

For this reason, we present versions of the classical 
LDA and PCA algorithms that are directly adapted to 
body proportions. In particular, we develop tools that 
allow us to interpret the numerical results obtained by 
these multivariate analyses in terms of the body sizes 
and body proportions of the individuals in question. 
The first method, adapted to LDA and called the "LDA 
ratio extractor," allows the extraction of the ratios that 
are most informative for distinguishing between two or 
more groups. In this context, we also introduce a mea- 
sure for deciding how much of the variation between 
individuals or groups of individuals is due to shape dif- 
ferences and how much is due to size differences. The 
second tool, called the "PCA ratio spectrum," allows the 
interpretation of principal components in terms of ra- 
tios. In a similar manner, the "allometry ratio spectrum" 
can be used to assess the extent of allometric behavior 
in ratios. Furthermore, we present several concepts of 
size and discuss their relation to multivariate allometry 
(Klingenberg 1996). Central to this discussion is allo- 
metric size (Jolicoeur 1963), a concept that was derived 
only heuristically. In the Appendix, we therefore provide
a statistical derivation of Jolicoeur's allometric size
vector using the method of least squares. Finally, the 
above methods are illustrated with a data set from par- 
asitic wasps (Baur 2002) and a classic data set from rock 
crabs (Campbell and Mahon 1974). The former is ideally 
suited for our purpose as ratios are commonly used in 
the taxonomy of these wasps (see above). The latter is 
often used for testing new statistical methods; it is in- 
cluded here because of the strong allometric behavior of 
certain variables. 

The mathematical framework, especially the 
definition of shape and size used in this paper, is 
adopted from the work of Mosimann (1970), Darroch 
and Mosimann (1985), Sampson and Siegel (1985), and 
Rao and Suryawanshi (1996), which has a long and acknowledged
history in morphometry (see, e.g., Pimentel
1979; Reyment et al. 1984; Marcus 1990; Klingenberg 
1996; Dryden and Mardia 1998; Richtsmeier et al. 2002; 
Claude 2008). The papers of Mosimann (1970) and Dar- 
roch and Mosimann (1985) established the theoretical 
foundation for the use of body ratios in multivariate 
analysis and thus provided an ideal starting point for 
our methods. Sampson and Siegel (1985) and Rao and 
Suryawanshi (1996) were more concerned with partic- 
ular definitions of size and shape. In contrast to these 
authors, our focus is on interpretation of body propor- 
tions rather than mere size and shape. Of course, other 
concepts for the analysis of size and shape (e.g., Cadima 
and Jolliffe 1996; McCoy et al. 2006; Claude 2008; Hotz 
et al. 2010) or the analysis of ratios (e.g., Aitchison 
1986 for compositional data) have been proposed, but 
these are, in our opinion, less suited in our context (see 
below). 



Methodology 

The methods presented below consist of a number of 
steps that are briefly itemized here. The data are first 
standardized and transformed into logarithms, then the 
shape space is defined and a suitable size vector cho- 
sen. Based on these steps, the best ratios for separation 
of groups are extracted using a new algorithm adapted 
to LDA, called the LDA ratio extractor. Associated with 
this method is a particular measure that allows us to 
compare the discriminatory power of size with that of 
shape. The second new tool, called the PCA ratio spec- 
trum, allows us to interpret the axes of a PCA in terms of 
ratios. A related method, the allometry ratio spectrum, 
is suitable for examination of the allometric behavior 
of ratios. Computation of all examples was done with
the R statistical software, version 2.11.1 (R Development
Core Team 2010). For obtaining data sets and R files for
all methods presented here, see the Supplementary Material
section.

As mentioned in the introduction, the mathematical 
framework adopted here originates from Mosimann 
(1970) and followers. A statistical framework frequently 
used in the Earth Sciences is Aitchison's analysis of com- 
positional data, also called simplicial analysis (Aitchison 
1986; Pawlowsky-Glahn and Egozcue 2001). Typically, 
compositional data vectors have positive components 
that sum up to one: imagine, for instance, a rock com- 
posed of three minerals in proportions 20%, 50%, and 
30%. The corresponding data points (0.2, 0.5, 0.3) lie on 
a so-called simplex. The unit-sum constraint means a 
loss of 1 degree of freedom and requires special sta- 
tistical tools, many of which have been developed by 
John Aitchison and his followers. We chose not to apply 
simplicial analysis to morphometric body ratios for two 
main reasons: First, ratios do not naturally satisfy the 
unit-sum constraint. Second, ratios have a complicated 
interrelationship not present in compositional data: the 
ratios a/b and b/c completely determine the ratio a/c. 
One could, alternatively, renormalize all body measure- 
ments to unit sum and thus obtain scale-free data on a 
simplex. This would free the path to simplicial analysis. 
However, it is not obvious to us how to extract statistical 
information about ratios from these renormalized data 
in a natural way. Also, our variants of LDA and PCA in 
Euclidean space would first have to be adapted to sim- 
plicial data, and it is not obvious how to do this, either. 
For these reasons, we preferred Mosimann's framework 
to that of Aitchison. 



Standardizing the Data 

For certain multivariate methods, it is important to
standardize the data beforehand; otherwise, larger variables
will dominate the analysis. As an example, let
$u = (u_1, \ldots, u_p)$ represent vectors of body measurements
associated with $N$ individuals of some animal population.
It may happen that $u_1$, say, is many times larger
than $u_2$ and $u_3$, and so the ratio $u_2/u_3$ will be largely
dominated by the ratios $u_1/u_2$ and $u_1/u_3$. For this reason,
the variables $u_i$ should be transformed in a way that they
are all in the same order of magnitude. A convenient
way to achieve this is to divide each variable by its
geometric population mean (Claude 2008). The transformed
variables will be called $y_i$. They and their ratios
vary around 1. In this scale, a value of $y_i = 1.2$, for
example, means that the individual's corresponding body
trait is 20% larger than the (geometric) average over
the population (strictly speaking, this standardization is
only crucial in PCA but has no impact on LDA).
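The standardization described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code: the function name `standardize_by_geometric_mean` and the measurements are hypothetical, and the geometric mean is taken over the whole sample.

```python
import math

def standardize_by_geometric_mean(data):
    """Divide each variable (column) by its geometric mean over all individuals.

    data: list of rows, one row of positive measurements per individual.
    Returns rows of standardized values y_i (dimensionless, varying around 1).
    """
    n = len(data)
    p = len(data[0])
    # Geometric mean of column j = exp(mean of the logs of column j).
    geo_means = [
        math.exp(sum(math.log(row[j]) for row in data) / n) for j in range(p)
    ]
    return [[row[j] / geo_means[j] for j in range(p)] for row in data]

# Hypothetical measurements (mm) of three traits on four individuals.
measurements = [
    [10.0, 1.0, 0.50],
    [12.0, 1.1, 0.55],
    [11.0, 0.9, 0.45],
    [13.0, 1.2, 0.60],
]
standardized = standardize_by_geometric_mean(measurements)

# After standardization, the logs of every variable average to zero.
log_means = [
    sum(math.log(row[j]) for row in standardized) / len(standardized)
    for j in range(3)
]
```

Although the first trait is roughly ten times larger than the second in raw units, all standardized values lie close to 1, so no variable dominates merely because of its scale.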

Space of log-ratios. — Our interpretation of results from 
statistical analysis of shape will mainly take place in the 
space of ratios (or body proportions) 

$$ r_{ij} = y_i / y_j. $$

For $p$ variables, there are in principle $p^2$ ratios; observe,
however, that only $p(p-1)/2$ of these are informative
and that even fewer, namely $p - 1$, can vary freely.
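As a worked count, take $p = 23$, the number of characters in the wasp data set analyzed in the Results section:

```latex
p^2 = 529, \qquad \frac{p(p-1)}{2} = \frac{23 \cdot 22}{2} = 253, \qquad p - 1 = 22.
```

The first number counts all ordered pairs, including the trivial ratios $y_i/y_i = 1$; the second keeps one member of each reciprocal pair; the third is the number of ratios, for instance $y_1/y_p, \ldots, y_{p-1}/y_p$, from which all others can be computed.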

The relations between ratios being of multiplicative 
nature, it is common in multivariate morphometry to 
pass to log-transformed values (Reyment et al. 1984; 
Klingenberg 1996; Claude 2008). This transformation 
allows the application of linear statistical methods and 
furthermore avoids some problems associated with the 
statistical analysis of ratios (see Hills 1978, in response 
to Atchley et al. 1976). 

We thus denote $x_i = \log y_i$ and

$$ d_{ij} = \log r_{ij} = \log(y_i / y_j) = x_i - x_j. \tag{1} $$

Following Aitchison (1983), we call the numbers $d_{ij}$ log-ratios.
Note that due to our standardization of the original
data, the mean of the variables $x_i$ is zero. Also, if
$r_{ij} \approx 1$, we have

$$ d_{ij} = \log(1 + (r_{ij} - 1)) \approx r_{ij} - 1 $$

and thus the log-ratios roughly correspond to the deviation
of the ratios from 100%.
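A small numerical sketch may help to see why the log scale is convenient. The values below are hypothetical standardized measurements of a single individual, and `log_ratio` is an illustrative helper name.

```python
import math

def log_ratio(y, i, j):
    """Log-ratio d_ij = log(y_i / y_j) = x_i - x_j for one individual."""
    return math.log(y[i]) - math.log(y[j])

# Hypothetical standardized measurements of one individual (values near 1).
y = [1.05, 0.98, 1.10]

d01 = log_ratio(y, 0, 1)
d12 = log_ratio(y, 1, 2)
d02 = log_ratio(y, 0, 2)

# Multiplicative relations between ratios become additive on the log scale:
# r_02 = r_01 * r_12 corresponds to d_02 = d_01 + d_12.
additivity_gap = abs(d02 - (d01 + d12))

# For ratios close to 1, d_ij is close to the relative deviation r_ij - 1.
r01 = y[0] / y[1]
approx_error = abs(d01 - (r01 - 1.0))
```

Here the ratio $r_{01}$ is about 1.07, and its log-ratio differs from $r_{01} - 1$ by less than 0.005, in line with the first-order approximation.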

Shape 

As mentioned in the introduction, a ratio can be calcu- 
lated from any two body measurements and be used to 
describe the form of a specimen. A ratio thus represents 
one way for defining shape (Claude 2008). Mosimann 
(1970) generalized this particular concept of shape for 
many measurements by posing the question, "When do 
two individuals have the same shape with respect to a 
finite number of measurements?". His definitions form 
the basis for our methods and are in the following for- 
mally introduced. 

To the (standardized) body measurements $y = (y_1, \ldots, y_p)^T$
of some individual, we would like to assign a set of
numbers encapsulating the individual's body shape. We
assume that these numbers can be calculated by formulas
of the form $y_1^{b_1} y_2^{b_2} \cdots y_p^{b_p}$. As shape values should be
invariant under scaling $\lambda y$, the exponents must satisfy
the shape restriction

$$ b_1 + b_2 + \cdots + b_p = 0. \tag{2} $$

Passing to the log-values $x_i$, we define

$$ \beta(x) = \log(y_1^{b_1} \cdots y_p^{b_p}) = b^T x \tag{3} $$

to be the shape function associated to the vector of
coefficients $b = (b_1, \ldots, b_p)^T$ subject to the shape restriction
(2). We will also standardize $b$ to length 1 ($\|b\| = 1$).
Geometrically, these constraints mean that $b$ is a unit vector
at right angles to the vector $\mathbf{1} = (1, \ldots, 1)^T$, that is, it lies
in the $(p-1)$-dimensional subspace $\mathbf{1}^\perp$ ("shape space")
orthogonal to the vector $\mathbf{1}$. If

$$ P = I - (\mathbf{1}\mathbf{1}^T)/p \tag{4} $$

denotes the orthogonal projection onto the shape space
$\mathbf{1}^\perp$, then we calculate the shape values $z = (z_1, \ldots, z_p)^T$
according to

$$ z = Px. \tag{5} $$

The vector $b$ represents a direction in shape space,
and the shape function $\beta(x)$ is the scalar product of $z$
with the vector $b$:

$$ \beta(x) = b^T x = b^T z. $$

Log-ratios $d_{ij}$ are represented by the log-ratio vectors

$$ b_{ij} = e_i - e_j, \tag{6} $$

where $e_i$ and $e_j$ are the $i$-th and $j$-th standard basis vectors
in $\mathbb{R}^p$. We collect these vectors in a set $\mathcal{B} = \{b_{ij}\}_{1 \le i < j \le p}$.
The fact that there are many linearly independent subsets
of $\mathcal{B}$ spanning $\mathbf{1}^\perp$ reflects the interdependence of
body ratios and poses a major problem for the interpretation
of statistical results in terms of body proportions.
We will address this problem below.
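The following sketch illustrates the projection onto shape space and its invariance under scaling. It assumes the projection $P = I - \mathbf{1}\mathbf{1}^T/p$, which amounts to subtracting the mean of the log-values; the function names and data are hypothetical, and the log-ratio vector is used without rescaling to unit length.

```python
import math

def shape_values(x):
    """Project log-measurements x onto shape space: z = Px = x - mean(x)."""
    mean_x = sum(x) / len(x)
    return [xi - mean_x for xi in x]

def shape_function(b, x):
    """Shape function b^T x for coefficients b that sum to zero."""
    assert abs(sum(b)) < 1e-12, "shape restriction: coefficients must sum to 0"
    return sum(bi * xi for bi, xi in zip(b, x))

# Hypothetical standardized measurements, and the same individual doubled.
y = [1.20, 0.90, 1.05, 0.95]
y_doubled = [2.0 * v for v in y]
x = [math.log(v) for v in y]
x_doubled = [math.log(v) for v in y_doubled]

z = shape_values(x)
z_doubled = shape_values(x_doubled)

# Log-ratio vector e_1 - e_3 of the text (zero-based indices 0 and 2).
b13 = [1.0, 0.0, -1.0, 0.0]
d13_from_x = shape_function(b13, x)
d13_from_z = shape_function(b13, z)
```

Doubling all measurements leaves the shape values unchanged, and the log-ratio can be computed from either the log-values or the shape values with the same result.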

Size 

Analogous to shape functions, a size function can be
defined. We stipulate a size function to be of the form
$y_1^{a_1} y_2^{a_2} \cdots y_p^{a_p}$; this time the exponents fulfill the size
restriction

$$ a_1 + a_2 + \cdots + a_p = 1. \tag{7} $$

Thus, an individual with all body measurements
doubled, say, will be twice as large. In terms of the
log-values $x$, we define

$$ \alpha(x) = \log(y_1^{a_1} \cdots y_p^{a_p}) = a^T x \tag{8} $$

to be the size function corresponding to the size vector
$a = (a_1, \ldots, a_p)^T$. Three size vectors have been commonly
proposed in the literature: isometric size, allometric size,
and shape-uncorrelated size, whose definitions are presented
in the following. Shape-uncorrelated size is discussed
here for the sake of completeness. In developing
the methodology below, our focus will be on isometric
and allometric size.

Isometric size. — The "democratic" way is to give equal
weight to all body measurements. This is tantamount
to the choice $a_0 = (1/p)\mathbf{1}$, and the size $\alpha_0(x) = a_0^T x$ is
simply the arithmetic mean of $x$. In many cases, the size
$\alpha_0(x)$ and the shape values $z$ will show significant correlation
over the population. This is a sign of the presence
of allometry.
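As a minimal sketch of isometric size, the snippet below computes the mean of the logs for a hypothetical individual and checks the defining property of a size function, namely that doubling all measurements doubles the size on the original scale. The name `isometric_size` is illustrative.

```python
import math

def isometric_size(y):
    """Isometric size on the log scale: the arithmetic mean of log(y_i)."""
    return sum(math.log(v) for v in y) / len(y)

# Hypothetical standardized measurements of one individual.
y = [1.20, 0.90, 1.05, 0.95]
size = isometric_size(y)
size_doubled = isometric_size([2.0 * v for v in y])

# exp(size) is the geometric mean of the measurements; doubling every
# measurement multiplies it by exactly 2.
growth_factor = math.exp(size_doubled - size)
```

Because the weights $1/p$ sum to one, the size restriction is satisfied and the growth factor equals 2 regardless of the individual's shape.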

Allometric size. — Allometry was first observed by Cuvier 
and intensively studied by Huxley and Teissier for bi- 
variate data (e.g., body weight vs. some body trait); see 
Gayon (2000) for a short history of allometry. A gen- 
eralization to multivariate data sets was proposed by 
Jolicoeur (1963). He arrived at his definition of allomet- 
ric size in a rather heuristic way, whereas we propose 
in the Appendix a statistical model that leads to Jolicoeur's
generalization in a natural manner. One way to
pass from the bivariate to the multivariate case is by
putting forth the question: Which is the measure of body
size fitting optimally into the set of bivariate allometric power
laws

$$ y_i = d_i \times (\text{body size})^{c_i}, \quad i = 1, \ldots, p, $$

for suitable coefficients $d_i$ and exponents $c_i$? A mathematically
more precise formulation is given in the Appendix.
The answer to this question is the size function associated
to the size vector $a_J$ spanning the first principal
component of the log-values $x$, a fact that is proved in
the Appendix by means of the least squares method.
More precisely, $a_J := a_1 / (\mathbf{1}^T a_1)$, where $a_1$ is the unit eigenvector
of the population covariance matrix $\Sigma = E(x x^T)$
corresponding to the largest eigenvalue $\lambda_1$: $\Sigma a_1 = \lambda_1 a_1$.
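The definition of the allometric size vector can be sketched as follows. This is an illustration under stated assumptions: the sample covariance matrix stands in for the population matrix, the leading eigenvector is found by plain power iteration rather than a library routine, and the data are hypothetical log-values that follow exact power laws with exponents 1.0, 1.5, and 0.5.

```python
import math

def covariance(u, v):
    """Sample covariance of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

def covariance_matrix(rows):
    """Sample covariance matrix; rows hold one individual each."""
    cols = list(zip(*rows))
    return [[covariance(ci, cj) for cj in cols] for ci in cols]

def first_eigenvector(matrix, iterations=500):
    """Unit eigenvector of the largest eigenvalue, by power iteration.

    Assumes the start vector is not orthogonal to that eigenvector.
    """
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return v

def allometric_size_vector(log_rows):
    """First principal axis of the log-values, rescaled to sum to 1."""
    a1 = first_eigenvector(covariance_matrix(log_rows))
    total = sum(a1)
    return [ai / total for ai in a1]

# Hypothetical log-values x_i = c_i * s with exponents c = (1.0, 1.5, 0.5).
exponents = (1.0, 1.5, 0.5)
log_sizes = [-0.2, -0.1, 0.0, 0.1, 0.2]
log_rows = [[c * s for c in exponents] for s in log_sizes]

a_J = allometric_size_vector(log_rows)
```

For these noise-free data the size vector is proportional to the exponents, that is, (1/3, 1/2, 1/6), whereas the isometric vector would weight all three variables by 1/3.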

Shape-uncorrelated size. — A choice of size function that
represents the other extreme to allometric size was proposed
by Sampson and Siegel (1985) and by
Rao and Suryawanshi (1996). Their size vector $a_R$ has
the property that size and shape over the population
are uncorrelated. The shape-uncorrelated size vector is
given by

$$ a_R := \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}}. $$

The size vector $a_R$ is harder to interpret geometrically
than $a_J$. An interpretation is offered in Rao
and Suryawanshi (1996): A unit increase in shape-uncorrelated
size represents the same average increase
(or decrease) in all the variables $x_1, \ldots, x_p$. It is also
proved in Rao and Suryawanshi (1996) that $a_R^T x$ is the
only size function that is stochastically independent of
shape if $x$ has a multivariate normal distribution. This
was already shown by Sampson and Siegel (1985) for
linear size functions, but it holds true even for nonlinear
size functions.
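A short sketch shows why this size vector is uncorrelated with every log-ratio: it makes $\Sigma a_R$ proportional to the vector $\mathbf{1}$, so that the covariance $b_{ij}^T \Sigma a_R$ vanishes. The covariance matrix below is hypothetical, and the linear system is solved with a hand-written Gaussian elimination only to keep the example free of dependencies.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * s = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix, copied so the inputs are left untouched.
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution

def shape_uncorrelated_size_vector(sigma):
    """Solve Sigma * s = 1 and rescale s so that its entries sum to 1."""
    s = solve_linear_system(sigma, [1.0] * len(sigma))
    total = sum(s)
    return [si / total for si in s]

# Hypothetical (positive definite) covariance matrix of three log-values.
sigma = [
    [0.040, 0.030, 0.020],
    [0.030, 0.050, 0.025],
    [0.020, 0.025, 0.030],
]
a_R = shape_uncorrelated_size_vector(sigma)

# Sigma * a_R has identical entries, so any difference of two entries,
# which is the covariance of a log-ratio with this size, is zero.
sigma_a = [sum(sigma[i][j] * a_R[j] for j in range(3)) for i in range(3)]
```

Note that the entries of such a size vector need not all be positive, which is one reason why it is harder to interpret than the isometric or allometric vectors.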

The LDA Ratio Extractor: Selecting the Best Ratios with 
Discriminant Analysis 

As mentioned above, LDA is a standard tool in multivariate
morphometry. It often makes it possible to distinguish even
very similar taxa, but the numerical results obtained are then
hard to interpret. Our aim is to adapt standard LDA in a
way that its results admit a convenient interpretation in
terms of the body proportions of the specimens under
study. Our algorithm is recursive and the basic idea is
as follows. In a first step, the ratio with the largest
discriminating power is determined. Then a ratio is chosen
that has maximal discriminating power but at the same
time is as little correlated as possible with the first ratio.
If needed, further ratios can be picked out in the same
manner.

Suppose that the values $x_1$, $x_2$ stem from two distinct
groups with means $m_1$, $m_2$ and a common (nonsingular)
within-groups covariance matrix $\Sigma$. Then Fisher's
discriminant vector $w$ is determined by

$$ w \propto \Sigma^{-1}(m_1 - m_2) \tag{9} $$

and $\|w\| = 1$. The vector $w$ is a mixture of size and shape.

Often taxonomists prefer to perform LDA purely
within shape space $\mathbf{1}^\perp$, that is, to ignore the effects of
size. Hence, the method is presented entirely in the
shape space. The common within-groups covariance
matrix of the shape values $z_i = Px_i$, $i = 1, 2$, is given by
$\Sigma_1 = P\Sigma P$, which is symmetric and positive definite on
the subspace $\mathbf{1}^\perp$. Because it is singular in $\mathbb{R}^p$, its
pseudo-inverse must be used to perform the LDA. By singular
value decomposition, there exists an orthogonal
transformation matrix $O$ such that

$$ \Lambda := O^T \Sigma_1 O = \mathrm{diag}(\sigma_1, \ldots, \sigma_{p-1}, 0). $$

Set $\Lambda^+ = \mathrm{diag}(\sigma_1^{-1}, \ldots, \sigma_{p-1}^{-1}, 0)$ and $\Sigma_1^+ = O \Lambda^+ O^T$. The
shape discrimination vector $w_1$ is now determined by

$$ w_1 \propto \Sigma_1^+ P (m_1 - m_2). \tag{10} $$

It is hard to interpret $w_1$ in terms of body proportions
because it is a mixture of ratios and, worse, can be written
in infinitely many ways as a linear combination of
log-ratio vectors (cf. formula 6) from the set $\mathcal{B}$. In the next
paragraph, we develop an algorithm that extracts the
most informative body ratios for between-groups
distinction.

Extracting ratios. — Let $x$ denote the combined data set in
which both groups $x_1$ and $x_2$ have been centered to 0
individually. Thus, $E(x) = 0$ and $\mathrm{var}(x) = \Sigma$. The dominant
log-ratio vector from $\mathcal{B}$ with respect to discrimination
between groups is the one that has the largest correlation
with $w_1$ in the data set $x$. More precisely, we consider
the correlation coefficients

$$ c(b_{ij}, w_1) = \frac{|\mathrm{cov}(b_{ij}^T x, w_1^T x)|}{\sqrt{\mathrm{var}(b_{ij}^T x)\,\mathrm{var}(w_1^T x)}} = \frac{|b_{ij}^T \Sigma w_1|}{\sqrt{b_{ij}^T \Sigma b_{ij} \cdot w_1^T \Sigma w_1}} $$

and set

$$ b_1 := \arg\max_{b_{ij} \in \mathcal{B}} c(b_{ij}, w_1). \tag{11} $$

The discriminating power of a vector $v \in \mathbb{R}^p$ can be
measured by the standard distance $D(v)$, that is, the
difference of the means of $v^T x_i$, $i = 1, 2$, divided by the
common within-groups standard deviation:

$$ D(v) = \frac{|v^T (m_1 - m_2)|}{(v^T \Sigma v)^{1/2}}. \tag{12} $$

The term "standard distance" was introduced in Flury
and Riedwyl (1986) (the square of $D$ is sometimes called
Rayleigh coefficient). Note that $w_1$ is the vector in $\mathbf{1}^\perp$
that maximizes $D(b)$ among all shape vectors $b \in \mathbf{1}^\perp$.
By (10) and because $Pw_1 = w_1$, we have for any $b \in \mathbf{1}^\perp$:



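$$
\begin{aligned}
c(b, w_1) &= \frac{|b^T \Sigma w_1|}{(b^T \Sigma b \cdot w_1^T \Sigma w_1)^{1/2}}
\propto \frac{|b^T \Sigma_1 \Sigma_1^+ P (m_1 - m_2)|}{(b^T \Sigma b \cdot w_1^T \Sigma w_1)^{1/2}} \\
&= \frac{|b^T P (m_1 - m_2)|}{(b^T \Sigma b \cdot w_1^T \Sigma w_1)^{1/2}}
= \frac{|(Pb)^T (m_1 - m_2)|}{(b^T \Sigma b \cdot w_1^T \Sigma w_1)^{1/2}} \\
&\propto D(b),
\end{aligned}
$$

where the factors of proportionality do not depend on $b$.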

Thus, we observe that $b_1$ defined in (11) has the
strongest discriminating power among all log-ratio vectors
$b_{ij} \in \mathcal{B}$. The highest possible standard distance
for discrimination within size-and-shape space $\mathbb{R}^p$ is
$D_{tot} := D(w)$, where the discrimination vector $w$ is given
by (9). It is necessary to list the values

$$ D_{ij} := \frac{D(b_{ij})}{D_{tot}} \tag{13} $$

in order to get the magnitude of the discriminating
power of each ratio. In this listing, the log-ratio $b_{ij}$ with
the second largest value $D_{ij}$ is likely to be already largely
explained by $b_1$ due to the strong correlations between
ratios. For this reason, we restrict the shape space $\mathbf{1}^\perp$ to
the subspace $H_2$ such that the values $b^T x$ for $b \in H_2$ are
uncorrelated with $b_1^T x$. It is easy to check that $H_2$ is
orthogonal to the vector $\Sigma b_1$. Projection onto $H_2$ is given by
the matrix

$$ P_2 = I - M (M^T M)^{-1} M^T, $$

where $M$ is the $p \times 2$ matrix $M = [a_0 | \Sigma b_1]$. Set $\Sigma_2 = P_2 \Sigma P_2$
and calculate the (unit length) discrimination vector $w_2$
according to

$$ w_2 \propto \Sigma_2^+ P_2 (m_1 - m_2), $$

where $\Sigma_2^+$ is the pseudo-inverse of $\Sigma_2$ (which has rank
$p - 2$). Now, let $b_2$ be the log-ratio vector $b_{ij}$ that shows
largest correlation to $w_2$. Iteration of this procedure
leads to the following algorithm to compute the sequence
of ratios $b_i$, $i = 1, \ldots, p - 1$.

1. Let $M_1 = a_0$ and initialize $k = 1$.

2. Set $P_k = I - M_k (M_k^T M_k)^{-1} M_k^T$ and $\Sigma_k = P_k \Sigma P_k$.
Determine the pseudo-inverse $\Sigma_k^+$ and set

$$ w_k = \Sigma_k^+ P_k (m_1 - m_2). $$

3. Let $b_k = \arg\max_{b_{ij} \in \mathcal{B}} c(b_{ij}, w_k)$.

4. Add the column $\Sigma b_k$ to the matrix $M_k$:

$$ M_{k+1} = [a_0 | \Sigma b_1 | \ldots | \Sigma b_k]. $$

5. Increase $k$ by one unit (unless $k = p - 1$), and
continue at Step 2.

In practice, only a few iterations will be performed 
because the first two or three log-ratios bi, b 2 , . . . will 
already explain most of the discrimination between the 
two groups. 
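The first pass of the extractor can be illustrated without any matrix algebra, because in the first step the correlation with the shape discriminant is proportional to the standard distance, so the first ratio is simply the log-ratio with the largest standard distance. The sketch below covers only this first step, not the recursive decorrelation of later steps. The function names and data are hypothetical, and the pooled variance uses the divisor $n_1 + n_2 - 2$, which is an assumption of this example.

```python
import math
from itertools import combinations

def mean(values):
    return sum(values) / len(values)

def pooled_variance(g1, g2):
    """Within-groups variance of a single variable, pooled over two groups."""
    ss1 = sum((v - mean(g1)) ** 2 for v in g1)
    ss2 = sum((v - mean(g2)) ** 2 for v in g2)
    return (ss1 + ss2) / (len(g1) + len(g2) - 2)

def standard_distance(g1, g2):
    """|difference of group means| / within-groups standard deviation."""
    return abs(mean(g1) - mean(g2)) / math.sqrt(pooled_variance(g1, g2))

def rank_log_ratios(group1, group2):
    """Standard distance of every log-ratio x_i - x_j, largest first.

    group1, group2: lists of rows of log-measurements.
    Returns a list of ((i, j), D) tuples with zero-based indices.
    """
    p = len(group1[0])
    ranking = []
    for i, j in combinations(range(p), 2):
        d1 = [row[i] - row[j] for row in group1]
        d2 = [row[i] - row[j] for row in group2]
        ranking.append(((i, j), standard_distance(d1, d2)))
    return sorted(ranking, key=lambda item: item[1], reverse=True)

# Hypothetical log-measurements of three traits in two groups.
group1 = [[0.00, 0.02, 0.01], [0.05, 0.04, 0.06],
          [-0.03, -0.02, -0.04], [0.02, 0.00, 0.03]]
group2 = [[0.10, 0.01, -0.09], [0.14, 0.05, -0.05],
          [0.07, -0.03, -0.12], [0.12, 0.02, -0.08]]

ranking = rank_log_ratios(group1, group2)
best_pair, best_distance = ranking[0]
```

In these data the ratio of the first to the third trait separates the groups best. The other two ratios also separate the groups but are strongly tied to the first one, which is exactly the situation the projection in Step 2 is meant to handle.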

Extracting ratios for multiple groups. — Suppose we are
given $K$ groups (classes) $x_1, \ldots, x_K$ with means $m_1, \ldots, m_K$
and a common within-groups covariance matrix $\Sigma$.
The between-groups covariance matrix is defined by

$$ B = \sum_{k=1}^{K} n_k (m_k - m)(m_k - m)^T, $$

where $m$ is the total mean and $n_k$ is the number of
individuals in group $k$. A frequently used criterion for
discrimination in the multiple group case is

$$ Q(v) = \frac{v^T B v}{v^T \Sigma v}. $$

The unit vector $v_1$ maximizing $Q(\cdot)$ is the eigenvector
of $\Sigma^{-1} B$ with largest eigenvalue. The generalization of
our two-group algorithm explained above to the multiple
group case is the following:

1. Let $M_1 = a_0$ and initialize $k = 1$.

2. Set $P_k = I - M_k (M_k^T M_k)^{-1} M_k^T$, $\Sigma_k = P_k \Sigma P_k$ and
$B_k = P_k B P_k$. Determine the pseudo-inverse $\Sigma_k^+$ and
let $w_k$ be the eigenvector of $\Sigma_k^+ B_k$ with largest
eigenvalue.

3. Determine

$$ b_k = \arg\max_{b_{ij} \in \mathcal{B}} \frac{|b_{ij}^T \Sigma w_k|}{\sqrt{b_{ij}^T \Sigma b_{ij}}}. $$

4. Add the column $\Sigma b_k$ to the matrix $M_k$:

$$ M_{k+1} = [a_0 | \Sigma b_1 | \ldots | \Sigma b_k]. $$

5. Increase $k$ by one unit and continue at Step 2.

The philosophy behind this algorithm is exactly the
same as in the two-group case: First, we determine the
linear discriminant and choose the log-ratio vector
$b_{ij}$ with strongest correlation to $w_k$. Then we project to
a subspace of shape vectors that are uncorrelated to all
log-ratio vectors that have already been chosen. Again,
two or three iterations will be sufficient in practice.

Judging the influence of size. — As mentioned above, the
LDA ratio extractor was developed in the shape space,
which is convenient for most circumstances. Sometimes,
however, it might be informative to know how well
particular groups are separated in relation to size. In order
to assess how much of the total separation is due to size,
we define $D_{size} := D(a_0)/D_{tot}$ and $D_{shape} := D(w_1)/D_{tot}$.
One can then view the number 



as a measure of how well size discriminates in compari- 
son with shape. 



The PCA Ratio Spectrum: Interpreting Principal 
Components with Ratios 

PCA is a very widely used method in multivariate 
statistics (Jolliffe 2004). In contrast to LDA, specimens 
are not assigned to different groups for a PCA but are 
treated as a single group. The resulting scatterplots can 
then be used to explore the structure of variation in this 
group. It might be the case that the pattern recovers 
groupings based on other sets of characters (qualitative 
morphology, molecular markers, etc.), which would 
give them additional weight. Usually, individual princi- 
pal components are interpreted in terms of the original 
variables (see Jolicoeur and Mosimann 1960 and Manly 
2005 for lucid examples). The method developed below 
allows an interpretation using ratios. The main ingre- 
dient of this method is a diagram that we call the PCA 
ratio spectrum. It allows the user to immediately read 
off the dominant ratios as well as their interrelationships 
(recall that ratios are always interdependent in a complex
fashion as their number is larger than the number of degrees of
freedom in the data).

The technical details of this method and its theoretical
justification are presented below. Let the random
vector $x$ with $E(x) = 0$ and $\mathrm{cov}(x) = \Sigma$ (assumed to be
nonsingular) represent body measurements of a given
population. The first principal components vector
$u_1 = (u_i)_{i=1,\ldots,p}$ of the shape values $z = Px$ is the eigenvector
of $\Sigma_1 = P \Sigma P$ corresponding to the largest eigenvalue $\lambda_1$
of $\Sigma_1$:

$$ \Sigma_1 u_1 = \lambda_1 \cdot u_1. $$

For a log-ratio vector $b_{ij}$, we have

$$ \mathrm{cov}(b_{ij}^T x, u_1^T x) = b_{ij}^T \Sigma_1 u_1 = \lambda_1 \cdot b_{ij}^T u_1 = \lambda_1 \cdot (u_i - u_j). \tag{15} $$

This fact allows a simple graphical interpretation of the
first principal component $u_1$ in terms of body proportions:
The numerical values (coefficients) of the components
of $u_1$ are drawn as points on the real line.
We call this diagram the PCA ratio spectrum of the
vector $u_1$. To a pair of points $u_i$, $u_j$ on the spectrum
with a large difference corresponds a body proportion
$\log(y_i/y_j)$ that contributes substantially to the first
principal component; on the other hand, close points on
the spectrum contribute little. The PCA ratio spectrum
represents a mixture of all body proportions and shows
how much each of them contributes to the variation in
relation to the others. This can be illustrated with the
example given in Figure 2b. As can be seen by their
comparable separation in the spectrum, the ratios gaster
breadth : gaster length and postmarginal vein : tergum 7 length
have similar explaining power for the variance. On the
other hand, the ratio eye breadth : scape length has no
explanatory power because the corresponding points are
very close in the spectrum.

If desired, the same procedure can be applied to the 
second and following principal components. Let us em- 
phasize again that the method can only be applied in 
a statistically consistent manner when a PCA is per- 
formed within the shape space. 
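The construction of the spectrum can be sketched as follows: project the log-values onto shape space, find the first principal axis of the shape values, and rank pairs of variables by the distance between their coefficients. This is a minimal illustration with hypothetical data and function names; power iteration replaces a library eigen-solver, and the start vector is chosen so that it is not proportional to the vector of ones, which the shape covariance matrix maps to zero.

```python
import math
from itertools import combinations

def shape_values(x):
    """z = Px: subtract the mean of the log-values of one individual."""
    mean_x = sum(x) / len(x)
    return [xi - mean_x for xi in x]

def covariance(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

def covariance_matrix(rows):
    cols = list(zip(*rows))
    return [[covariance(ci, cj) for cj in cols] for ci in cols]

def first_eigenvector(matrix, iterations=500):
    """Unit eigenvector of the largest eigenvalue, by power iteration."""
    p = len(matrix)
    v = [1.0] + [0.0] * (p - 1)  # not proportional to (1, ..., 1)
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return v

def pca_ratio_spectrum(log_rows):
    """Return the spectrum points u_1 and the pairs ranked by |u_i - u_j|."""
    z_rows = [shape_values(x) for x in log_rows]
    u1 = first_eigenvector(covariance_matrix(z_rows))
    gaps = [((i, j), abs(u1[i] - u1[j]))
            for i, j in combinations(range(len(u1)), 2)]
    return u1, sorted(gaps, key=lambda item: item[1], reverse=True)

# Hypothetical log-values: traits 0 and 2 vary in opposite directions with
# t, while s is an overall size term that the projection removes.
shape_scores = [-2, -1, 0, 1, 2]
size_terms = [0.3, -0.1, 0.0, 0.2, -0.4]
log_rows = [[0.1 * t + s, s, -0.1 * t + s]
            for t, s in zip(shape_scores, size_terms)]

u1, spectrum_gaps = pca_ratio_spectrum(log_rows)
```

In this constructed example the points for traits 0 and 2 lie at opposite ends of the spectrum, so their ratio dominates the first shape component, while trait 1 sits in the middle.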

Statistical stability of the PCA ratio spectrum. — Sometimes
it might be useful to test whether the PCA ratio spectrum
is statistically stable. Instability occurs when the
largest eigenvalue $\lambda_1$ is not sufficiently distinct from
the smaller eigenvalues of $\Sigma_1$, though this is rarely
the case in practice. In order to obtain confidence
intervals for the points $u_i$ on the PCA ratio spectrum,
we assume that the values $x$ and hence $z$ are normally
distributed. More precisely: Let $z_1, \ldots, z_n$ be a random
sample drawn from a multivariate normal distribution
$\mathcal{N}(0, \Sigma_1)$. Denote by $\hat{\Sigma}_1$ the sample covariance matrix
and by $\hat{u}_1$ the standardized first principal components
vector of $\hat{\Sigma}_1$, pointing in the same half-space as $u_1$.
The sampling distribution of $\hat{u}_1$ is complicated, but Anderson
has established its large-sample distribution
(see theorem 13.5.1 in Anderson 2003). It follows from
this result that for sufficiently large sample size $n$, the
$i$-th component $\hat{u}_i$ of the random vector $\hat{u}_1$ is
approximately normally distributed
according to $\hat{u}_i \sim \mathcal{N}(u_i, \sigma_i^2)$, where

$$ \sigma_i^2 = \frac{\lambda_1}{n} \sum_{k=2}^{p-1} \frac{\lambda_k}{(\lambda_1 - \lambda_k)^2}\, u_{ik}^2. \tag{16} $$

Here, $\lambda_1 > \lambda_2 > \cdots > \lambda_{p-1}$ are the positive eigenvalues
of the matrix $\Sigma_1$ (which has rank $p - 1$) and
$u_{ik}$ are the elements of the matrix $U = (u_1 | \ldots | u_{p-1})$
formed by the corresponding standardized eigenvectors
$u_1, \ldots, u_{p-1}$. (The eigenvector $u_p$ corresponding to
$\lambda_p = 0$ is proportional to the isometric size vector $a_0$.)
Graphically, we represent the 68% confidence intervals
$[u_i - \sigma_i, u_i + \sigma_i]$ as perpendicular bars of length $2\sigma_i$ at
the corresponding point $u_i$ on the spectrum (Fig. 2b).
If the interval lengths are not too large compared with
the separation of the points on the spectrum, as is the
case in Figure 2b, then the spectrum can be considered
as statistically stable. Even when the normal assumption
is violated, the confidence intervals still give some
indication of the stability of the spectrum.

Alternatively, one can also sample the original val- 
ues z directly from the empirical distribution and obtain 
similar intervals with a bootstrap. The latter was used 
for estimating the confidence intervals in Figure 2b. 
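For the normal-theory intervals, a small sketch may be useful. It assumes that the large-sample variance of the $i$-th point of the spectrum has the form $\sigma_i^2 = (\lambda_1/n) \sum_{k \ge 2} \lambda_k (\lambda_1 - \lambda_k)^{-2} u_{ik}^2$, which is how we read the result from Anderson cited above. The function name, eigenvalues, and eigenvectors are hypothetical inputs, not values from the paper.

```python
import math

def spectrum_standard_errors(eigenvalues, eigenvectors, n):
    """Approximate standard errors of the points of the first spectrum.

    eigenvalues:  positive eigenvalues, largest first (lambda_1, lambda_2, ...).
    eigenvectors: matching unit eigenvectors, each a list of length p.
    n:            sample size.
    """
    lam1 = eigenvalues[0]
    p = len(eigenvectors[0])
    errors = []
    for i in range(p):
        total = 0.0
        # Sum over the remaining eigenpairs (k = 2, 3, ...).
        for lam_k, u_k in zip(eigenvalues[1:], eigenvectors[1:]):
            total += lam_k / (lam1 - lam_k) ** 2 * u_k[i] ** 2
        errors.append(math.sqrt(lam1 / n * total))
    return errors

# Hypothetical shape-space eigenpairs for p = 3 (both vectors sum to zero).
u1 = [1 / math.sqrt(2), 0.0, -1 / math.sqrt(2)]
u2 = [1 / math.sqrt(6), -2 / math.sqrt(6), 1 / math.sqrt(6)]
errors = spectrum_standard_errors([0.04, 0.01], [u1, u2], n=100)

# One-standard-error (about 68%) intervals around each spectrum point.
intervals = [(u - s, u + s) for u, s in zip(u1, errors)]
```

With these inputs the bars are short compared with the spacing of the points, so the spectrum would count as stable. As the two leading eigenvalues approach each other, the denominators shrink and the bars grow.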
The Allometry Ratio Spectrum: Assessing Allometric
Behavior of Ratios

The idea of a ratio spectrum introduced above is also
useful for extracting body ratios that show allometric
behavior. For a given size vector a (like $a_0$ or $a_J$), the body
ratio that shows the most distinctive allometric growth
can be interpreted as the one whose covariance with the
body size $a^T x$ is maximal. We obtain

$$\mathrm{cov}(b_{ij}^T x, a^T x) = b_{ij}^T \Sigma a = d_i - d_j,$$

where we have set $\Sigma a =: d = (d_i)_{i=1,\ldots,p}$. Hence, exactly as
in the preceding paragraph, the body proportions with
strongest allometric growth along the size vector a can
be read off the allometry ratio spectrum of d. A reasonable
choice of a size vector is Jolicoeur's size vector
$a_J$. In that case, we have $d \propto a_J$ and thus the allometric
body proportions can be directly determined by the
spectrum of $a_J$. An illustration of such a spectrum is
given in Figure 4.
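
The relation between the spectrum d and the covariance of a log-ratio with size can be checked numerically. The Python sketch below is an illustration on simulated data; the function names, the character names "A" to "D", and the growth coefficients are hypothetical. It assumes that $b_{ij}$ is the vector with +1 at position i and -1 at position j, so that $b_{ij}^T x$ is the log-ratio of characters i and j, and that Jolicoeur's size vector is the first principal component of the log-data rescaled so that its components sum to 1.

```python
import numpy as np

def jolicoeur_size_vector(x):
    """First principal component of the log-data, scaled to sum to 1."""
    lam, vec = np.linalg.eigh(np.cov(x, rowvar=False))  # ascending eigenvalues
    a1 = vec[:, -1]
    return a1 / a1.sum()

def allometry_spectrum(x, a):
    """Spectrum d = Sigma a for log-data x (n x p) and size vector a."""
    return np.cov(x, rowvar=False) @ a

# Simulated log-measurements with unequal growth coefficients (toy data).
rng = np.random.default_rng(0)
names = ["A", "B", "C", "D"]
growth = np.array([0.8, 1.0, 1.0, 1.3])
x = rng.normal(scale=0.3, size=(100, 1)) * growth + rng.normal(scale=0.03, size=(100, 4))

a_J = jolicoeur_size_vector(x)
d = allometry_spectrum(x, a_J)

# The ratio formed from the two extremal points of the spectrum.
order = np.argsort(d)
i, j = order[-1], order[0]
cov_ratio_size = np.cov(x[:, i] - x[:, j], x @ a_J)[0, 1]
print(f"most allometric ratio: {names[i]}:{names[j]}")
print(f"cov(log-ratio, size) = {cov_ratio_size:.5f}, d_i - d_j = {d[i] - d[j]:.5f}")
```

Because $a_J$ is an eigenvector of the covariance matrix, d is a multiple of $a_J$ in this sketch, which is why the spectrum of $a_J$ itself can be used.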

Results 

Discriminating Species 

As an illustration of how to apply the LDA ratio extractor,
we revisit a statistical analysis from Baur (2002)
where morphometric data from two species of parasitic
wasps were examined, namely the species Pteromalus
albipennis Walker, 1835 and P. solidaginis Graham and
Gijswijt, 1991 from the Pteromalus albipennis group (Insecta:
Hymenoptera: Chalcidoidea). The analysis is based on
p = 23 characters (called "head breadth," "OOL," "eye
height," etc.) measured on $n_1 = 32$ individuals from
P. albipennis (Group 1) and $n_2 = 19$ individuals from
P. solidaginis (Group 2); see Baur (2002) for a complete
description. The common within-group covariance matrix
is estimated by

$$\hat{\Sigma} = \frac{n_1}{n_1 + n_2} \hat{\Sigma}_1 + \frac{n_2}{n_1 + n_2} \hat{\Sigma}_2,$$

where $\hat{\Sigma}_1$, $\hat{\Sigma}_2$ are the estimated covariance matrices of
the two groups.

Before performing LDA, we would like to add a word
of caution rarely mentioned in the textbooks: If the total
number of individuals $n = n_1 + n_2$ is not distinctly
larger than the number p of body traits, the results from
an LDA can be completely spurious. The reason is that
the dimension is large enough that a separating plane is
likely to exist between the two groups even if the sample
points are completely random. As a rule of thumb,
one should always have $n > 2p + \sqrt{p}$. A theoretical
justification of this rule is given in MacKay (2003, p. 490).
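
The rule of thumb is easy to check before running an analysis. The short Python sketch below reads the rule as n > 2p + sqrt(p) and applies it to the sample sizes stated above; the function name is hypothetical.

```python
import math

def lda_sample_size_ok(n, p):
    """Rule of thumb: n should exceed 2p + sqrt(p) before an LDA is trusted."""
    return n > 2 * p + math.sqrt(p)

# Sample sizes from the Pteromalus example in the text.
n1, n2, p = 32, 19, 23
n = n1 + n2
threshold = 2 * p + math.sqrt(p)
print(f"n = {n}, threshold = {threshold:.1f}, ok = {lda_sample_size_ok(n, p)}")
```

For n = 51 and p = 23 the threshold is about 50.8, so the Pteromalus data meet the rule, though only narrowly.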

By applying the LDA ratio extractor introduced in the
Methodology section, we obtain OOL:gaster length as
the most discriminating ratio. We get $D_{\mathrm{size}} = 0.064$ and
$D_{\mathrm{shape}} = 0.964$, hence $\delta = 0.063$ (cf. formula 14). Thus,
discrimination between the groups stems mostly from
shape differences. The next discriminating body ratio,
being as little correlated as possible with OOL:gaster
length, is eye breadth:marginal vein. Its standard distance
$D_{ij}$ (see formula 13) is 2.1 as compared with the standard
distance $D_{ij} = 5.6$ for the first ratio. As can also be seen
from the scatterplot in Figure 1a, the discriminating
power as compared with the first ratio is already much
lower. Figure 1b shows the next two ratios extracted
from the algorithm, funicle 1 length:propodeum length and
scape length:postmarginal vein, with standard distances
$D_{ij} = 2.3$ and $D_{ij} = 1.7$, respectively. By looking at the
plots in Figure 1a and b, one could be tempted to simply



[Figure 1: scatterplot panels (a) and (b). Recoverable axis labels: OOL / gaster length (a); funicle 1 length / propodeum length (b).]

FIGURE 1. Scatter plots of the four most discriminating ratios for Pteromalus albipennis (dots) and P. solidaginis (triangles). Plot (a) shows first 
versus second ratio, plot (b) third versus fourth ratio. 
combine the first (OOL:gaster length) with the third ratio
(funicle 1 length:propodeum length) to arrive at an even
better separation of groups. However, one should bear
in mind that these ratios are highly correlated and therefore
stand for more or less the same information.
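
To make the notion of a standard distance for a single ratio concrete, the following Python sketch computes it for a log-ratio in two groups. Formula 13 is not reproduced in this section, so the sketch assumes the conventional univariate definition (absolute difference of the group means divided by the pooled standard deviation); the function names and all measurement values are hypothetical and are not taken from Baur (2002).

```python
import math
from statistics import mean, variance

def log_ratio(numerator, denominator):
    """Log of the ratio of two measurements, specimen by specimen."""
    return [math.log(a / b) for a, b in zip(numerator, denominator)]

def standard_distance(g1, g2):
    """Assumed definition: |mean difference| / pooled standard deviation."""
    n1, n2 = len(g1), len(g2)
    pooled_var = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
    return abs(mean(g1) - mean(g2)) / math.sqrt(pooled_var)

# Hypothetical measurements of two characters in two groups.
group1_a = [0.20, 0.22, 0.21, 0.19, 0.23]
group1_b = [1.00, 1.05, 1.02, 0.98, 1.08]
group2_a = [0.15, 0.16, 0.14, 0.17, 0.15]
group2_b = [1.01, 1.04, 0.97, 1.06, 1.00]

d_ab = standard_distance(log_ratio(group1_a, group1_b),
                         log_ratio(group2_a, group2_b))
print(f"standard distance of the log-ratio a:b = {d_ab:.2f}")
```

Two ratios with large individual distances can still carry nearly the same information if they are highly correlated, so such distances should be read together with a scatterplot of the ratios.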



Interpreting Principal Components 

Figure 2a shows the results of a PCA on the same
data set, but this time the two Pteromalus species were
entered in the analysis as a single group. A PCA is
always useful for examining the structure of variation
in a single population, for instance, when it is difficult
to assign specimens to different groups beforehand
(Pimentel 1979; Reyment et al. 1984; Claude 2008). It
can also give additional weight to groupings based
on other features. In this case, the specimens in the
scatterplot were labeled as either P. albipennis or P. solidaginis
according to qualitative character differences,
such as coloration or forewing pilosity, and host plant
association (see Graham and Gijswijt 1991). As can be
seen from Figure 2a, the first principal component is
fully congruent with the separation of species. For the
[Figure 2: panels (a), (b), and (c). Character labels recoverable from the PCA ratio spectrum in (b): malar space, pedicel plus flagellum, head height, stigmal vein, marginal vein, tergum 7 breadth, tergum 7 length, upper face, postmarginal vein, propodeum length. Recoverable axis label: gaster breadth / tergum 7 length.]
FIGURE 2. Application of the PCA ratio spectrum using the Pteromalus data, with Pteromalus albipennis (dots) and P. solidaginis (triangles). (a)
Scatterplot of a principal component analysis (PCA) in shape space. (b) PCA ratio spectrum of the first principal component. The ratio formed
from the extremal points (i.e., gaster breadth:tergum 7 length) explains a large part of the variation of the first component. In contrast, ratios
formed from characters lying close to each other in the spectrum (e.g., marginal vein:postmarginal vein) explain very little. This is apparent in the
scatterplot (c). Confidence intervals (horizontal bars in (b), see Methodology section) were estimated with a bootstrap.
[Figure 3: scatterplot. Recoverable axis label: isometric size.]

FIGURE 3. Scatterplot of isometric size versus first principal
component in shape space for the Pteromalus data set, with Pteromalus
albipennis (dots) and P. solidaginis (triangles). The mean size of P.
solidaginis is obviously smaller but it still lies within the range of
Pteromalus albipennis.



interpretation of this component, the PCA ratio spectrum
is displayed in Figure 2b. Most of the variation
is explained by ratios like gaster breadth:tergum 7 length
that correspond to points lying at opposite ends of
the spectrum. On the other hand, ratios formed from
characters lying adjacent to each other in the spectrum,
like marginal vein:postmarginal vein, explain very little.
This is visualized in the scatterplot of the two ratios
(Fig. 2c). Of course, the ratio spectra of the second
and third principal components could also be drawn, and
sometimes this might be illuminating as well, for instance,
for explaining the structure of variation within
each species.

The above analysis exemplifies the use of our methodology
in the shape space. Sometimes a researcher might
be interested in examining differences in the size of the
specimens, for instance, for investigating the influence
of ecological parameters or different food regimes on
populations (McCoy et al. 2006). Here, one could simply
plot the isometric size axis (see Size section above)
against the first principal component in shape space.
From Figure 3 it is evident that the mean size of Pteromalus
solidaginis is smaller, but that its range still lies
within that of P. albipennis.

Assessing Allometry 

We will illustrate the use of the allometry ratio spec- 
trum on a classical data set of specimens of the purple 
rock crab Leptograpsus variegatus (Fabricius, 1793) (Crus- 
tacea: Brachyura: Grapsidae) from Western Australia 
(see Campbell and Mahon 1974). These occur in two 



[Figure 4: allometry ratio spectra, panels (a) and (b).]

FIGURE 4. The allometry ratio spectrum for the Leptograpsus variegatus
data set for blue type males (a) and for orange type males (b),
respectively. The characters shown are carapace length (CL) and width
(CW), width of frontal lobe (FL), rear width (RW), and body depth
(BD) (see Results section). The bars do not represent confidence
intervals here.



color forms, blue and orange. Mahon collected 50 individuals
from each color form and from each sex and
made five body measurements: carapace length (CL)
and width (CW), width of frontal lobe (FL), rear width
(RW), and body depth (BD). We calculated the allometric
size vectors $a_J$ for the body measurements of the
males of both the blue and the orange morph. Figure 4
shows the corresponding allometry ratio spectra for
both morphs. As can be seen, the ratio BD:RW shows
the largest allometric growth, whereas for CL:CW allometry
is negligible in both groups. Figure 5 confirms
this conclusion: There we display scatter plots for the
orange type males of the isometric sizes versus the log-ratios
of BD:RW and CL:CW, respectively. Whereas the
first ratio (Fig. 5a) visibly has a strong correlation with
isometric size, as is characteristic for allometry as explained
in the Methodology section, this is much less
the case for the second ratio (Fig. 5b).

It is useful to test allometry versus isometry, that is, to
test the null hypothesis that $a_J = a_0$. Such a test, under
the hypothesis of normality and relatively large sample
size, was developed by Anderson (2003) (see section
11.6.2). Adapted to our situation, the P value of the null
hypothesis is given by $\mathrm{Prob}(\chi^2_{p-1} > k)$ where the test
value k is determined by

$$k = n \left( p \lambda_1 a_0^T \Sigma^{-1} a_0 + p \lambda_1^{-1} a_0^T \Sigma a_0 - 2 \right).$$

Here, $\Sigma$ is the covariance matrix of the sample x of size
n and $\lambda_1$ is its largest eigenvalue. For the male Leptograpsus,
the P values are virtually zero for both color types,
hence the null hypothesis that no allometry is present
can safely be rejected.
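
The test value k can be sketched in a few lines of Python. This is an illustration on simulated data, not a reanalysis of the Leptograpsus measurements: the function names and simulation parameters are hypothetical, and the isometric size vector is assumed to have all components equal to 1/p. The P value would then be the upper tail of a chi-square distribution with p - 1 degrees of freedom, which can be obtained, for example, with `scipy.stats.chi2.sf(k, df)` if SciPy is available.

```python
import numpy as np

def isometry_test_statistic(x):
    """Test value k for H0: a_J = a_0, and its degrees of freedom p - 1."""
    n, p = x.shape
    sigma = np.cov(x, rowvar=False)
    lam1 = np.linalg.eigvalsh(sigma)[-1]          # largest eigenvalue
    a0 = np.full(p, 1.0 / p)                      # assumed isometric size vector
    term_inv = p * lam1 * (a0 @ np.linalg.solve(sigma, a0))
    term_dir = (p / lam1) * (a0 @ sigma @ a0)
    return n * (term_inv + term_dir - 2.0), p - 1

def simulate(growth, n=200, seed=0):
    """Toy log-data: a common size factor times growth coefficients plus noise."""
    rng = np.random.default_rng(seed)
    size = rng.normal(scale=0.3, size=(n, 1))
    noise = rng.normal(scale=0.02, size=(n, len(growth)))
    return size * np.asarray(growth) + noise

k_iso, df = isometry_test_statistic(simulate([1.0, 1.0, 1.0, 1.0, 1.0]))
k_allo, _ = isometry_test_statistic(simulate([0.6, 0.8, 1.0, 1.2, 1.4]))
print(f"isometric toy data:  k = {k_iso:.2f} on {df} degrees of freedom")
print(f"allometric toy data: k = {k_allo:.2f} on {df} degrees of freedom")
```

The statistic is zero when the first principal component points exactly along the isometric direction and grows as the two directions diverge.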

Discussion 

As initially mentioned, a number of body measure- 
ments are commonly collected in taxonomic research. 
This mainly serves two purposes. First, the raw or 
FIGURE 5. Scatter plots of isometric size versus log-ratios body depth:rear width (a) and carapace length:width (b) for the orange type males in
the Leptograpsus variegatus data set.



log-transformed variables are entered in some kind 
of standard multivariate statistical analysis (MVA) for 
studying character variation and for discrimination of 
taxa. PCA and LDA are among the methods of choice 
in this respect and are the ones we refer to with MVA 
below. Second, the same measurements are integrated 
in descriptive works, but this time by calculating ratios 
(indeed, the numerical output from MVA would be far 
too awkward for inclusion in descriptions and identifi- 
cation keys). Of course, it would be most useful if, say, 
a discriminant function could be interpreted in terms 
of ratios that then could be directly used for a species 
description. One could, for instance, expect some guide- 
lines for the choice of ratios. So far, this was not possible 
because the two kinds of analysis were not directly com- 
parable (see below). Thus, ratio analysis usually adheres 
to certain standards established for a particular group, 
rather than following the insights gained from MVA. 
A case in point is the study of the Encarsia meritoria 
species complex (Insecta: Hymenoptera: Aphelinidae) 
by Polaszek et al. (2004), where some of the best ratios 
used for species discrimination were not even included 
in their elaborate PCA and LDA. 

The incompatibility of MVA and ratio analysis results
from the way size and shape functions are defined for
each method (see Fig. 6 for further details). However,
the methods presented here, namely the newly developed
LDA ratio extractor and the PCA ratio spectrum,
solve these problems by using the same definitions for
size and shape. Therefore, the results from MVA can
now be interpreted in terms of ratios that, in turn, can
be directly incorporated in a variety of descriptive taxonomic
works. In fact, a more sophisticated use of ratios
may be achieved, as is demonstrated by our application
of the LDA ratio extractor to the data set from parasitic
wasp species of the family Pteromalidae. Here, the best
ratios found for separating the two Pteromalus species
were OOL (distance of lateral ocellus to eye margin):gaster
(abdomen) length, funicle 1 (antenna) length:propodeum
length, etc. (see Results section and Fig. 1). These ratios
relate characters from widely separated body parts and
differ from those commonly used in the taxonomy of
pteromalid wasps. For instance, in Graham (1969), still
the standard reference in the field (Grissell and Schauff
1997), ratios are exclusively formed from characters lying
adjacent to each other, like eye height:breadth or thorax
length:breadth (see also Graham and Gijswijt 1991). Evidently,
the variation of such ratios among specimens
can, to a certain extent, be judged by eye. However, as
demonstrated here, these ratios are apparently not the
best ones for discrimination. It is of course very difficult
if not impossible to judge by eye the discriminating
power of ratios based on widely separated characters,
a task that is best done analytically with the help of an
algorithm such as the one presented in this paper.

The present methodology can thus easily be embedded
in a consistent statistical framework for the multivariate
analysis of morphometric data. In particular,
it allows us to interpret the results of a PCA and LDA
entirely in terms of ratios, which themselves form the
core information of most quantitative taxonomic works.
The important point of the new methodology is to determine
the shape values and to choose a particular
size vector beforehand. For the size function, we mainly
considered the isometric size vector $a_0$, except for the
allometry ratio spectrum, which relates to Jolicoeur's
allometric size vector $a_J$. Of course, other definitions
of shape and size are possible (see Bookstein 1989 for
a review). By using the "back-projection" method of
Burnaby (1966), some authors (e.g., Klingenberg 1996;
McCoy et al. 2006) choose to define their shape values
by projecting the log-data x on the space orthogonal to


2011 



BAUR AND LEUENBERGER— ANALYSIS OF RATIOS 



823 



[Figure 6: four scatterplot panels. (a), (b) Standard principal component analysis (recoverable axis labels: 1. principal component, 2. principal component). (c), (d) Isometric size and principal component analysis in shape space (recoverable axis labels: isometric size axis, 1. principal component in shape space). Specimens x and y are marked.]

FIGURE 6. Scatterplots of principal component analyses (PCA) of a single species of Pteromalus (n = 32 specimens of Pteromalus albipennis,
p = 23 variables of body measurements; data from Baur 2002), showing the effect of different definitions of size and shape. Specimen labeled 
y is a clone of specimen x but with all variables scaled by a factor of 1.4. The two specimens have therefore equal values for all their ratios 
and are only separated along the isometric size axis, as indicated by the line connecting x with y. (a) Scatterplot of first against second and (b) 
of second against third component respectively of a standard PCA on the covariance matrix of log-transformed data. The first component is 
considered as a general size measure because its coefficients have the same sign and are of similar magnitude for all variables. However, they 
are not exactly the same, thus the first component of a standard PCA is usually considered as the allometric size axis (Jolicoeur 1963; Claude 
2008). The remaining components define the shape space in this analysis. Note that the line of isometry is not parallel to the first component, 
and, thus, reflects the different size measures. As a result, specimens x and y are also widely separated points in the shape space, although 
viewed from their body proportions they are identical. For (c) and (d) the same data were used, but here they were subjected to a PCA after 
removal of isometric size (for details of computation, see the Methodology section). Now, the line of isometry connecting x with y lies of course 
parallel to the isometric size axis (c). In the shape space (d) the two specimens form a single point, because only those specimens appear distinct 
which also differ in body proportions. 



the allometric vector $a_J$. The reason for this is to transform
away shape effects related to allometric growth.
According to this view, size is represented by the first,
shape by all the following principal components of the
log-data. It is, however, unclear how these shape values
could be properly interpreted in terms of body proportions;
in particular, no ratio spectrum can be assigned
in a mathematically consistent way to "shape" vectors
orthogonal to $a_J$. Moreover, the allometric growth law in
its bivariate or multivariate versions is just a convenient
statistical model and by no means a "law of nature"
(Gould 1966). In our opinion, allometry should be
treated as a hypothesis to be tested after the size
values are determined rather than be incorporated into
the framework from the very beginning. We therefore
prefer to analyze allometric variation with the help of the
allometry ratio spectrum, as demonstrated above (see
Results section).
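
The property illustrated in Figure 6, namely that a specimen and its clone scaled by a factor of 1.4 coincide in shape space and differ only in isometric size, can be reproduced with a minimal Python sketch. The function name and the measurement values are hypothetical, and the sketch assumes that isometric size is the mean of the log-measurements of a specimen and that shape values are the log-measurements minus that mean.

```python
import numpy as np

def size_and_shape(y):
    """Split raw measurements y (n x p) into isometric size and shape values."""
    x = np.log(y)
    size = x.mean(axis=1)          # a_0^T x with a_0 = (1/p, ..., 1/p), assumed
    shape = x - size[:, None]      # log-data with isometric size removed
    return size, shape

# Hypothetical specimen x and its clone y with all measurements scaled by 1.4.
specimen = np.array([2.0, 0.5, 1.2, 0.8])
clone = 1.4 * specimen
size, shape = size_and_shape(np.vstack([specimen, clone]))

print("size difference:", size[1] - size[0])          # equals log(1.4)
print("shapes identical:", np.allclose(shape[0], shape[1]))
```

Since every ratio of two measurements is unchanged by the scaling, the two rows of shape values agree, which is the sense in which this definition of shape is compatible with an analysis of ratios.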

Our new methods are obviously rooted in the field of
multivariate morphometrics (Reyment et al. 1984). The
latter is occasionally dubbed traditional morphometrics
(Marcus 1990), as opposed to "modern morphometrics"
(Claude 2008) such as the analysis of landmarks (geometric
morphometrics, Adams et al. 2004; Zelditch et al.
2004) or outlines (e.g., elliptic Fourier analysis, Lestrel
1989, 2000). The main reason why we stay within multivariate
morphometrics is simply given by the nature of
our data. Landmark and outline data are ideally suited
for fixed objects, such as a skull or the body of a fish.
For an insect with articulated extremities, those methods
are of limited use unless one is willing to study the
form of the head, thorax, or wings in separate analyses.
This can and should be done. Nevertheless, it is often
useful to include measurements from all over the body
in a single analysis. For instance, a taxonomist trying
to distinguish between two very similar species will be
happy about any discriminating character. What if they
are best separated by the ratio of, say, the length of the
hind leg and the eye height? As we have shown above, it
is here that methods of multivariate morphometrics,
adapted for the analysis of ratios, could play a major
role.

Supplementary Material 

Supplementary material can be found at http://www.sysbio.oxfordjournals.org/.
APPENDIX

Statistical Derivation of the Allometric Size Vector 

We would like to arrive at an estimation of the allometric size vector starting
from a statistical model of the allometric growth hypothesis. Let y be the
original data and $\alpha(y) = \prod_{k=1}^{p} y_k^{a_k}$ a size function. According to Huxley (1932),
each trait $y_i$ when graphed against the individual's size $\alpha(y)$ should satisfy the
power law

$$y_i = d_i \cdot \alpha(y)^{c_i}, \quad i = 1, \ldots, p. \qquad (A1)$$

Here we consider $d_i$ as positive random variables and $c_i$ as constant coefficients.
We shall use the approach of least squares to statistically estimate the coefficients
$a_i$ and $c_i$. Taking logarithms on both sides of (A1), we get

$$x_i = c_i \sum_{k=1}^{p} a_k x_k + \mu_i + \varepsilon_i,$$

where $\mu_i = E(\log d_i)$ and $E(\varepsilon_i) = 0$. In vector notation, this reads

$$x = (a^T x) c + \mu + \varepsilon$$

with $E(\varepsilon) = 0$. Because $E(x) = 0$, we conclude $\mu = 0$. We estimate a and c in a
way that the sum of squares is minimal:

$$(\hat{a}, \hat{c}) = \operatorname{argmin}_{a, c} S(a, c),$$

where $S(a, c) = E\|\varepsilon\|^2$. We have

$$S(a, c) = E\|x - (a^T x) c\|^2$$
$$= E[x^T x + (a^T x)^2 (c^T c) - 2 (a^T x)(c^T x)]$$
$$= E(x^T x) + (c^T c) a^T \Sigma a - 2 a^T \Sigma c.$$

Calculating vector derivatives with respect to a and c we get

$$\frac{\partial}{\partial a} S(a, c) = 2 (c^T c) \Sigma a - 2 \Sigma c$$

and

$$\frac{\partial}{\partial c} S(a, c) = 2 (a^T \Sigma a) c - 2 \Sigma a.$$

Setting both equations equal to 0 and dropping the hats over a and c, we arrive
at the system of equations:

$$(a^T \Sigma a) c = \Sigma a,$$
$$(c^T c) \Sigma a = \Sigma c.$$

Multiplying the second equation from the left by $\Sigma^{-1}$, solving for c and plugging
the result into the first equation, one can see that a is an eigenvector of $\Sigma$
with eigenvalue

$$\lambda = \frac{a^T \Sigma a}{\|a\|^2}$$

and $c = a / \|a\|^2$. Replacing these results in S(a, c) one gets:

$$S(a, c) = E(x^T x) - \lambda.$$

Evidently, this expression is minimal if $\lambda$ is the largest eigenvalue of $\Sigma$. Let $a_1$
denote the unit vector representing the first principal component of the data x.
Imposing the size restriction, we arrive at the solution

$$a_J = a_1 / 1^T a_1$$

and $c_J = a_J / \|a_J\|^2$. Historically, Jolicoeur (1963) was the first to introduce a
multivariate generalization of Huxley's allometric power law and he proposed our
$a_J$ as a measure of size (or rather $a_1$ to be precise). He did not, however, give a
statistical model to motivate his definition.
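
The result of the derivation can be verified numerically. The Python sketch below uses simulated, centered log-data (the growth coefficients and sample size are hypothetical) and replaces the expectation E by the sample average. It assumes that the size restriction means that the components of the size vector sum to 1. It checks that the least-squares criterion evaluated at the solution equals the trace of the covariance matrix minus its largest eigenvalue, and that a perturbed size vector does not give a smaller value.

```python
import numpy as np

# Simulated log-data (toy example), centered so that E(x) = 0.
rng = np.random.default_rng(1)
n, p = 500, 4
growth = np.array([0.7, 1.0, 1.1, 1.2])
x = rng.normal(scale=0.3, size=(n, 1)) * growth + rng.normal(scale=0.03, size=(n, p))
x -= x.mean(axis=0)

sigma = x.T @ x / n                      # covariance with E as the sample average
lam, vec = np.linalg.eigh(sigma)         # ascending eigenvalues
a1 = vec[:, -1]                          # first principal component (unit vector)
a_J = a1 / a1.sum()                      # assumed size restriction: 1^T a = 1
c_J = a_J / (a_J @ a_J)

def S(a, c):
    """Least-squares criterion: average of ||x - (a^T x) c||^2 over specimens."""
    resid = x - np.outer(x @ a, c)
    return np.mean(np.sum(resid ** 2, axis=1))

S_min = S(a_J, c_J)
S_perturbed = S(a_J + 0.05 * rng.normal(size=p), c_J)
print(f"S(a_J, c_J)          = {S_min:.6f}")
print(f"trace(Sigma) - lam_1 = {np.trace(sigma) - lam[-1]:.6f}")
print(f"S at perturbed a     = {S_perturbed:.6f}")
```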



